Cross-Domain Perceptual Reward Functions

Authors

  • Ashley D. Edwards
  • Charles Lee Isbell
Abstract

In reinforcement learning, we often define goals by specifying rewards within desirable states. One problem with this approach is that we typically need to redefine the rewards each time the goal changes, which often requires some understanding of the solution in the agent’s environment. When humans are learning to complete tasks, we regularly utilize alternative sources that guide our understanding of the problem. Such task representations allow one to specify goals on their own terms, thus providing specifications that can be appropriately interpreted across various environments. This motivates our own work, in which we represent goals in environments that are different from the agent’s. We introduce Cross-Domain Perceptual Reward (CDPR) functions, learned rewards that represent the visual similarity between an agent’s state and a cross-domain goal image. We report results for learning the CDPRs with a deep neural network and using them to solve two tasks with deep reinforcement learning.
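As a rough illustration of the idea of a reward based on visual similarity between an agent's state and a goal image from another domain, the following minimal sketch may help. It is not the authors' method: the abstract describes rewards learned with a deep neural network, whereas here a hand-written intensity histogram (the hypothetical `embed` function) stands in for learned features, and cosine similarity (the hypothetical `perceptual_reward` function) stands in for the learned reward. Images are assumed to be nested lists of intensities in [0, 1], and may differ in size across domains.

```python
import math


def embed(image, bins=4):
    """Hypothetical stand-in for a learned feature extractor.

    Builds an L2-normalized histogram of pixel intensities, so images of
    different sizes map to feature vectors of the same length.
    Assumes every pixel value lies in [0, 1].
    """
    counts = [0.0] * bins
    for row in image:
        for pixel in row:
            index = min(int(pixel * bins), bins - 1)
            counts[index] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm > 0 else counts


def perceptual_reward(state_image, goal_image):
    """Cosine similarity between the features of the state and the goal."""
    state_features = embed(state_image)
    goal_features = embed(goal_image)
    return sum(s * g for s, g in zip(state_features, goal_features))


# The goal image comes from a different "domain" (here, a different size).
goal = [[0.9, 0.9, 0.9], [0.9, 0.1, 0.9]]
near_state = [[0.8, 0.95], [0.85, 0.2]]   # mostly bright, like the goal
far_state = [[0.1, 0.2], [0.05, 0.15]]    # mostly dark, unlike the goal

near_reward = perceptual_reward(near_state, goal)
far_reward = perceptual_reward(far_state, goal)
```

In this toy setup the state that looks more like the goal receives the higher reward, which is the property a reinforcement learning agent would exploit; a learned feature extractor would replace `embed` in any realistic setting.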

Similar resources

Reward Sharpens Orientation Coding Independently of Attention

It has long been known that rewarding improves performance. However, it is unclear whether this is due to high-level modulations in the output modules of associated neural systems or to low-level mechanisms favoring more "generous" inputs. Some recent studies suggest that primary sensory areas, including V1 and A1, may form part of the circuitry of reward-based modulations, but there is no d...

COVARIANCE MATRIX OF MULTIVARIATE REWARD PROCESSES WITH NONLINEAR REWARD FUNCTIONS

Multivariate reward processes with reward functions of constant rates, defined on a semi-Markov process, were first studied by Masuda and Sumita (1991). Reward processes with nonlinear reward functions were introduced in Soltani (1996). In this work we study a multivariate process whose components are reward processes with nonlinear reward functions. The Laplace transform of the covar...

Cross-modal effects of value on perceptual acuity and stimulus encoding.

Cross-modal interactions are very common in perception. An important feature of many perceptual stimuli is their reward-predicting properties, the utilization of which is essential for adaptive behavior. What is unknown is whether reward associations in one sensory modality influence perception of stimuli in another modality. Here we show that auditory stimuli with high-reward associations incr...

Distinct dynamics of ramping activity in the frontal cortex and caudate nucleus in monkeys.

The prefronto-striatal network is involved in many cognitive functions, including perceptual decision making and reward-modulated behaviors. For well-trained subjects, neural responses frequently show similar patterns in the prefrontal cortex and striatum, making it difficult to tease apart distinct regional contributions. Here I show that, despite similar mean firing rate patterns, prefrontal ...


Journal:
  • CoRR

Volume abs/1705.09045

Publication date 2017